Easy2Siksha.com
GNDU QUESTION PAPERS 2024
BA/BSc 6th SEMESTER
QUANTITATIVE TECHNIQUES – VI
Time Allowed: 3 Hours Maximum Marks: 100
Note: Aempt Five quesons in all, selecng at least One queson from each secon.
The Fih queson may be aempted from any secon.
All quesons carry equal marks.
SECTION – A
1. Dierenate between Mathemacal Economics and Econometrics.
Discuss in detail the sources of disturbance term in a stochasc econometric model.
2. Derive the β-coecients for Simple Linear Regression Model through Least Squares
Esmaon Method.
Also, illustrate the assumpons of simple linear regression model.
SECTION – B
3. State and prove the Gauss–Markov Theorem for a general linear regression model.
4. (a) What is the Coefficient of Determination?
Differentiate between the Coefficient of Determination and the Adjusted Coefficient of
Determination.
(b) The following pairs of values of X and Y are given and the relationship to be estimated is:
Y = β₀ + β₁X + u
Test the significance of the parameters at the 5 percent level of significance and find R²:
Y:  60   90  110  125  150  170  180  200  220  230
X: 100  150  200  250  300  350  400  450  500  550
SECTION – C
5. Discuss in detail the consequences and remedial measures for heteroscedasticity.
6. (a) In case of the following model:
Y = β₀ + β₁X₁ + β₂X₂ + u
Suppose X₂ is mistakenly omitted from the above model.
Find the specification bias.
(b) Explain Frisch's confluence and Farrar–Glauber tests of multicollinearity in detail.
SECTION – D
7. What do you understand by the problem of autocorrelation?
Discuss in detail the Durbin–Watson test and the remedies of autocorrelation.
8. (a) Discuss the Koyck approach to distributed lag models.
(b) How is the dummy variable model an alternative to the Chow test?
GNDU ANSWER PAPERS 2024
BA/BSc 6th SEMESTER
QUANTITATIVE TECHNIQUES – VI
Time Allowed: 3 Hours Maximum Marks: 100
Note: Aempt Five quesons in all, selecng at least One queson from each secon.
The Fih queson may be aempted from any secon.
All quesons carry equal marks.
SECTION – A
1. Dierenate between Mathemacal Economics and Econometrics.
Discuss in detail the sources of disturbance term in a stochasc econometric model.
Ans: Difference between Mathematical Economics and Econometrics
Imagine economics as a way to understand how people and markets behave. Now, there are
two powerful tools economists use to do this:
Mathematical Economics → uses math language to express theory
Econometrics → uses data and statistics to test theory
Let’s explore both in a relatable way.
Mathematical Economics: The language of economic theory
Mathematical economics is like writing economics in the language of mathematics. Instead
of long sentences, economists use equations and symbols to describe relationships.
For example:
We know from theory that when price increases, demand decreases.
In mathematical economics, we write:
󰇛󰇜
Easy2Siksha.com
This simply means:
“Demand depends on price.”
If we want to be more specific:
Qd = a − bP
This equation shows a precise relationship between demand and price.
Notice something important:
This equation is exact and theoretical.
It assumes everything else is constant and perfect.
So mathematical economics focuses on:
Building economic models
Expressing theory precisely
Logical deduction
Optimization (profit maximization, cost minimization)
It does not check data.
It only says what should happen according to theory.
Econometrics: Testing theory with real-world data
Now imagine we go to a real market and collect data:
Price | Demand
10    | 100
12    |  95
15    |  80
20    |  60
We want to know:
“Does demand really fall when price rises?”
Here comes econometrics.
Econometrics takes the theoretical equation:
Qd = a − bP
and converts it into a statistical model:
Qd = a − bP + u
That small term u is very important.
It represents real-world imperfections.
Econometrics then uses:
Data
Statistics
Regression analysis
to estimate values of a and b and test whether theory is true.
Key Differences (in simple comparison)
Aspect       | Mathematical Economics     | Econometrics
Nature       | Theoretical                | Empirical (data-based)
Tools        | Mathematics                | Statistics + Mathematics
Purpose      | Formulate economic theory  | Test economic theory
Form         | Exact (deterministic)      | Probabilistic (stochastic)
Uses data?   | No                         | Yes
Example      | Demand function            | Estimated demand equation
In short:
Mathematical economics tells us what should happen.
Econometrics tells us what actually happens.
Disturbance Term in a Stochastic Econometric Model
Now let’s focus on the second part—this is the heart of econometrics.
In reality, economic behavior is never perfectly predictable. People differ, conditions
change, and many factors are unobserved. So econometric models include a disturbance
term (u).
What is a disturbance term?
Suppose theory says:
C = f(Y)   (consumption is a function of income)
But in real life, two people with the same income may spend differently.
Why?
Because consumption depends on many things:
habits
preferences
family size
expectations
culture
We cannot include everything in the model.
So we write:
C = a + bY + u
Here u (disturbance term) represents all unexplained influences.
It is basically the gap between theory and reality.
Sources of the Disturbance Term (Explained in Detail)
The disturbance term arises from several real-world reasons. Let’s understand each clearly
with examples.
1. Omission of relevant variables
This is the most important source.
We cannot include all variables affecting a dependent variable.
Example:
Consumption depends on:
income
wealth
family size
expectations
interest rates
But we may include only income.
So the effects of omitted factors enter the disturbance term.
Therefore:
Disturbance term = effect of omitted variables
2. Measurement errors
Economic data are rarely perfectly accurate.
Examples:
Income underreported in surveys
Price index approximations
GDP revisions
Consumption recall errors
If income is measured wrongly, the model cannot capture the true relationship.
So errors appear in the disturbance term.
3. Incorrect functional form
Sometimes the true relationship is nonlinear, but we assume linear.
Example:
True relation: Y = a + bX + cX²   (nonlinear)
But we estimate: Y = a + bX   (linear)
The mismatch between reality and the model goes into u.
So disturbance term captures specification errors.
4. Random human behavior
Human decisions are partly unpredictable.
Two individuals with same:
income
age
education
may still behave differently due to psychology or mood.
This randomness is unavoidable.
Hence disturbance term includes behavioral randomness.
5. Aggregation effects
Econometric models often use aggregate data:
national consumption
average income
total demand
But individuals differ greatly.
When we aggregate, individual variations remain unexplained.
These differences enter the disturbance term.
6. External shocks and unforeseen events
Economies face unexpected influences:
policy changes
strikes
pandemics
wars
weather shocks
These cannot be predicted or included fully.
So they appear in u.
7. Pure statistical noise
Even with perfect modeling, random fluctuations occur.
Example:
survey sampling variation
rounding errors
recording errors
These create unavoidable statistical noise.
Why the Disturbance Term is Essential
Without the disturbance term, econometrics would fail.
If we wrote:
C = a + bY   (with no error term)
we would assume perfect prediction, which is impossible in social science.
The disturbance term allows:
uncertainty
probability
statistical inference
hypothesis testing
So econometrics becomes realistic.
Conceptual Meaning
A disturbance term in a stochastic econometric model represents the combined effect of
omitted variables, measurement errors, incorrect model specification, random human
behavior, aggregation effects, and unforeseen external shocks. It captures the difference
between observed values and theoretical predictions and introduces randomness into
econometric relationships, making statistical estimation and hypothesis testing possible.
2. Derive the β-coefficients for the Simple Linear Regression Model through the Least Squares
Estimation Method.
Also, illustrate the assumptions of the simple linear regression model.
Ans: The Simple Linear Regression Model
The model is written as:
Y = β₀ + β₁X + u
Where:
Y = dependent variable (outcome we want to predict)
X = independent variable (predictor)
β₀ = intercept (value of Y when X = 0)
β₁ = slope (change in Y for a unit change in X)
u = error term (captures unexplained variation)
The goal is to estimate β₀ and β₁ using sample data.
Least Squares Estimation
The least squares method minimizes the sum of squared errors (residuals). Residuals are
the differences between observed values (Yᵢ) and predicted values (Ŷᵢ).
Residual: eᵢ = Yᵢ − Ŷᵢ
The objective is:
min Σ(Yᵢ − β₀ − β₁Xᵢ)²
This ensures the regression line fits the data as closely as possible.
Derivation of β-Coefficients
Step 1: Define the Residual Sum of Squares (RSS)
RSS = Σ(Yᵢ − β₀ − β₁Xᵢ)²
Step 2: Differentiate with Respect to β₀ and β₁
We take partial derivatives of RSS with respect to β₀ and β₁, and set them equal to zero
(first-order conditions):
∂RSS/∂β₀ = −2 Σ(Yᵢ − β₀ − β₁Xᵢ) = 0
∂RSS/∂β₁ = −2 Σ Xᵢ(Yᵢ − β₀ − β₁Xᵢ) = 0
Step 3: Solve the Equations
From the first condition:
ΣYᵢ = nβ₀ + β₁ΣXᵢ
From the second condition:
ΣXᵢYᵢ = β₀ΣXᵢ + β₁ΣXᵢ²
Step 4: Express in Terms of Means
Let:
X̄ = ΣXᵢ/n and Ȳ = ΣYᵢ/n
Then:
β̂₁ = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)²
β̂₀ = Ȳ − β̂₁X̄
Interpretation
β̂₁ (slope): Measures how much Y changes when X increases by one unit.
β̂₀ (intercept): The predicted value of Y when X = 0.
This derivation shows how regression coefficients are calculated directly from data using
least squares.
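The two formulas from Step 4 can be checked numerically. Below is a minimal Python sketch (illustrative only; the study-hours data are hypothetical, not from the paper) that computes β̂₀ and β̂₁ exactly as derived above:

```python
import numpy as np

# Hypothetical data: study hours (X) and exam scores (Y)
X = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
Y = np.array([25, 31, 34, 40, 44, 51, 54, 61], dtype=float)

x_bar, y_bar = X.mean(), Y.mean()

# Slope: sum of cross-deviations over sum of squared deviations of X
beta1_hat = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)
# Intercept: the fitted line passes through the point of means
beta0_hat = y_bar - beta1_hat * x_bar

print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```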
Assumptions of the Simple Linear Regression Model
For the estimates to be valid and unbiased, certain assumptions must hold:
1. Linearity:
o The relationship between X and Y is linear.
o Example: If X = study hours, Y = exam score, the effect is assumed to be
straight-line.
2. Independence of Errors:
o Residuals (errors) are independent of each other.
o No autocorrelation (important in time-series data).
3. Homoscedasticity (Constant Variance):
o The variance of errors is constant across all values of X.
o If variance increases with X, results may be unreliable.
4. Normality of Errors:
o Errors are normally distributed, especially important for hypothesis testing.
5. No Perfect Multicollinearity:
o In simple regression, only one predictor is used, so this assumption is
naturally satisfied.
6. Exogeneity:
o The independent variable X is not correlated with the error term.
o If violated, estimates become biased.
Example to Illustrate
Suppose we study the effect of hours studied (X) on exam score (Y).
Collect data from 10 students.
Use least squares to estimate β̂₀ and β̂₁.
If β̂₁ = 5, it means each extra hour of study increases the score by 5 marks.
If β̂₀ = 20, it means a student who studies 0 hours is expected to score 20 marks.
Conclusion
The least squares method derives regression coefficients by minimizing the sum of squared
errors, leading to the formulas for β̂₀ and β̂₁. The model rests on assumptions like linearity,
independence, homoscedasticity, and normality.
SECTION – B
3. State and prove the Gauss–Markov Theorem for a general linear regression model.
Ans: Gauss–Markov Theorem (General Linear Regression Model)
1. The General Linear Regression Model
First, we need to understand the setting in which the theorem works.
A general linear regression model can be written in matrix form as:
Y = Xβ + ε
Let’s decode this in simple terms:
Y → vector of observed dependent variable values
X → matrix of independent variables (predictors)
β (beta) → vector of unknown parameters (coefficients we want to estimate)
ε (epsilon) → vector of random errors
So in words:
Observed data = systematic part + random noise
2. Assumptions of the Gauss–Markov Theorem
The theorem works under some important assumptions. Think of these as “rules of the
game.”
(1) Linearity
The model is linear in parameters:
Y = Xβ + ε
This means coefficients appear linearly (no squares or products of β).
(2) Zero Mean of Errors
E(ε) = 0
On average, errors cancel out.
So the model is not systematically biased.
(3) Constant Variance (Homoscedasticity)
Var(εᵢ) = σ² for all i
All observations have equal error variance.
No observation is more “uncertain” than others.
(4) No Autocorrelation
Errors are independent:
Cov(εᵢ, εⱼ) = 0 for all i ≠ j
So one error does not influence another.
(5) Full Rank of X
Independent variables are not perfectly correlated.
This ensures coefficients can be uniquely estimated.
3. Statement of Gauss–Markov’s Theorem
Now the big idea:
Gauss–Markov Theorem:
Under the classical linear regression assumptions, the ordinary least squares (OLS) estimator
of β is the Best Linear Unbiased Estimator (BLUE).
4. What does BLUE mean?
This is very important. Let’s break it:
Linear
Estimator is a linear function of Y.
OLS estimator:
β̂ = (X′X)⁻¹X′Y
This is linear in Y.
Unbiased
E(β̂) = β
On average, OLS gives the true coefficients.
Best
“Best” means minimum variance among all linear unbiased estimators.
No other linear unbiased estimator has smaller variance than OLS.
5. Proof of Gauss–Markov’s Theorem (Step-by-Step)
We now prove that OLS is BLUE.
Step 1: Write the OLS Estimator
OLS estimator:
β̂ = (X′X)⁻¹X′Y
Substitute the model:
Y = Xβ + ε
So:
β̂ = (X′X)⁻¹X′(Xβ + ε) = (X′X)⁻¹X′Xβ + (X′X)⁻¹X′ε = β + (X′X)⁻¹X′ε
Step 2: Show Unbiasedness
Take expectation:
E(β̂) = β + (X′X)⁻¹X′E(ε)
Since:
E(ε) = 0
So:
E(β̂) = β
OLS is unbiased.
Step 3: Variance of the OLS Estimator
We already have:
β̂ − β = (X′X)⁻¹X′ε
Variance:
Var(β̂) = E[(β̂ − β)(β̂ − β)′] = E[(X′X)⁻¹X′εε′X(X′X)⁻¹]
Moving the non-random matrices outside the expectation:
Var(β̂) = (X′X)⁻¹X′ E(εε′) X(X′X)⁻¹
But:
E(εε′) = σ²I
Thus:
Var(β̂) = σ²(X′X)⁻¹
Step 4: Consider Any Other Linear Unbiased Estimator
Let another estimator be:
b = CY
where C is some matrix.
For unbiasedness:
E(b) = E(CY) = CXβ
To equal β for every β:
CX = I
So any linear unbiased estimator must satisfy:
CX = I
Step 5: Compare Variances
Variance of the alternative estimator:
Var(b) = C Var(Y) C′
But:
Var(Y) = σ²I
So:
Var(b) = σ²CC′
Step 6: Express C in Terms of OLS
We know:
CX = I
The OLS weighting matrix is:
(X′X)⁻¹X′
So write:
C = (X′X)⁻¹X′ + D
where:
DX = 0
Step 7: Variance Difference
Now:
Var(b) = σ²CC′ = σ²[(X′X)⁻¹X′ + D][(X′X)⁻¹X′ + D]′
Expanding, and using DX = 0 (so the cross terms vanish):
Var(b) = σ²(X′X)⁻¹ + σ²DD′
So:
Var(b) = Var(β̂) + σ²DD′
Step 8: Conclude Minimum Variance
Since:
DD′ is positive semi-definite,
therefore:
Var(b) ≥ Var(β̂)
Thus:
No linear unbiased estimator has smaller variance than OLS.
Final Conclusion (Gauss–Markov Theorem)
We have shown:
OLS is linear
OLS is unbiased
OLS has minimum variance
Therefore:
OLS is the BLUE estimator
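As an illustration (not part of the original proof), a small Monte Carlo sketch in Python can make BLUE concrete: it compares the OLS slope with another linear unbiased estimator, the line through the two endpoint observations. All data here are simulated assumptions; both estimators come out unbiased, but OLS shows the smaller variance, as the theorem predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(1, 10, 20)
X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
beta_true = np.array([2.0, 0.5])

ols_slopes, endpoint_slopes = [], []
for _ in range(5000):
    y = X @ beta_true + rng.normal(0, 1, size=len(x))
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)      # (X'X)^{-1} X'y
    ols_slopes.append(b_ols[1])
    # Another linear unbiased slope estimator: line through first and last points
    endpoint_slopes.append((y[-1] - y[0]) / (x[-1] - x[0]))

print("mean OLS slope      :", np.mean(ols_slopes))       # both ~0.5 (unbiased)
print("mean endpoint slope :", np.mean(endpoint_slopes))
print("var  OLS slope      :", np.var(ols_slopes))        # OLS variance is smaller
print("var  endpoint slope :", np.var(endpoint_slopes))
```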
6. Intuitive Meaning
Imagine you want to estimate the effect of:
Study hours → exam marks
You could use many estimation methods.
But the Gauss–Markov theorem tells us:
Among all fair (unbiased) linear methods,
OLS gives the most stable and precise estimates.
So OLS is not just convenient, it is mathematically optimal.
7. Important Notes for Exams
Students often confuse this:
Gauss–Markov does NOT require normal errors
Only mean 0 and equal variance
So even without normal distribution:
OLS is still BLUE
Normality is only needed for:
t-tests
F-tests
confidence intervals
8. Why Gauss–Markov Matters
This theorem explains why:
Regression uses least squares
Econometrics trusts OLS
Statistical software defaults to OLS
Because:
It is the most efficient linear unbiased estimator.
4. (a) What is the Coefficient of Determination?
Differentiate between the Coefficient of Determination and the Adjusted Coefficient of
Determination.
Ans: 4(a) What is Coefficient of Determination?
Meaning and Simple Understanding
Imagine you are trying to predict a student’s exam marks based on the number of hours
they study. Naturally, you expect that more study hours usually lead to higher marks. Now
suppose you collect data from many students and create a regression model (a statistical
equation) to predict marks from study hours.
But here comes an important question:
How good is your prediction model?
How much of the variation in marks is actually explained by study hours?
This is exactly what the Coefficient of Determination (R²) tells us.
Definition
The Coefficient of Determination (R²) is a statistical measure that shows how much of the
variation in the dependent variable is explained by the independent variable(s) in a
regression model.
In simple words:
R² tells us how well the model explains the data.
Real-Life Analogy
Think of R² like a report card for your regression model.
If R² = 0.90 → Model explains 90% of the outcome
If R² = 0.50 → Model explains 50%
If R² = 0.10 → Model explains only 10%
So higher R² means better explanation power.
Mathematical Idea (in simple terms)
When we analyze data, there is always some variation in outcomes.
Example: Students have different marks because of many factors:
Study hours
Intelligence
Teaching quality
Health
Exam difficulty
Now suppose your model only uses study hours. It will explain some part of marks variation
but not all.
R² measures:
R² = Explained Variation / Total Variation
So:
Explained variation → what model captures
Total variation → all differences in data
Range of R²
R² always lies between 0 and 1 (0 ≤ R² ≤ 1).
Meaning:
R² = 0 → Model explains nothing
R² = 1 → Perfect explanation
R² = 0.75 → Model explains 75% variation
Interpretation Example
Suppose a regression model predicts salary from education level and R² = 0.80.
This means:
80% of salary differences among people are explained by education level
20% is due to other factors (experience, skills, location, etc.)
Limitations of R²
Here comes an important insight:
R² always increases when you add more variables, even if they are useless.
Example:
Suppose you predict marks using:
Study hours
Shoe size
Favorite color
Even meaningless variables may slightly increase R².
This creates a problem: R² may look better, but the model is not truly better.
To solve this issue, statisticians created Adjusted R².
Adjusted Coefficient of Determination (Adjusted R²)
Meaning
Adjusted R² is a modified version of R² that penalizes unnecessary variables in the model.
It answers:
Does adding this variable actually improve the model?
Or is it just increasing R² artificially?
Simple Definition
Adjusted R² measures the explanatory power of a regression model after adjusting for the
number of predictors.
In simple words:
It checks model quality fairly
It discourages adding useless variables
Why Adjusted R² is Needed
Let’s imagine two models predicting exam marks:
Model 1: Study hours
R² = 0.70
Model 2: Study hours + Shoe size
R² = 0.71
R² increased slightly, but shoe size is meaningless.
Adjusted R² will detect this and may stay the same or even decrease.
So Adjusted R² protects us from overfitting.
Key Idea Difference
R² rewards complexity
Adjusted R² rewards meaningful complexity
Mathematical Insight (simplified)
Adjusted R² includes:
Sample size (n)
Number of predictors (k)
The standard formula is:
Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1)
So it adjusts model performance based on how many variables are used.
Example to Understand Clearly
Suppose we predict house price.
Model A:
Variables:
Size of house
R² = 0.65
Adjusted R² = 0.64
Model B:
Variables:
Size
Age
Distance from city
Owner’s favorite food
R² = 0.68
Adjusted R² = 0.63
Observation:
R² increased (0.65 → 0.68)
Adjusted R² decreased (0.64 → 0.63)
Meaning:
Extra variables did not really improve prediction
Model A is actually better
Difference Between R² and Adjusted R²
Now let’s clearly differentiate them.
1. Basic Meaning
R²:
Measures how much variation is explained by the model.
Adjusted R²:
Measures explained variation after adjusting for number of predictors.
2. Effect of Adding Variables
R²:
Always increases or stays same.
Adjusted R²:
May increase or decrease.
3. Sensitivity to Useless Variables
R²:
Cannot detect useless predictors.
Adjusted R²:
Penalizes useless predictors.
4. Model Comparison
R²:
Not reliable for comparing models with different predictors.
Adjusted R²:
Better for comparing models.
5. Overfitting Detection
R²:
Encourages overfitting.
Adjusted R²:
Helps avoid overfitting.
Comparison Table
Feature                    | Coefficient of Determination (R²) | Adjusted R²
Meaning                    | % variation explained             | Adjusted explanatory power
Range                      | 0 to 1                            | Can be negative to 1
Effect of adding variables | Always increases                  | May decrease
Useless predictors         | Not detected                      | Penalized
Model comparison           | Weak                              | Strong
Overfitting control        | No                                | Yes
Important Concept: Why Adjusted R² Can Be Lower
Adjusted R² becomes lower when:
Too many predictors
Small sample size
Weak relationships
It ensures model honesty.
Everyday Analogy
Think of R² like exam marks without considering difficulty.
Example:
Student A: 90/100 (easy exam)
Student B: 85/100 (very tough exam)
Raw score says A is better.
But difficulty-adjusted score may say B is better.
Similarly:
R² = raw performance
Adjusted R² = fair performance
When to Use R² vs Adjusted R²
Use R² when:
Simple regression
Same number of predictors
Understanding explanatory power
Use Adjusted R² when:
Multiple regression
Comparing models
Variable selection
Final Conceptual Summary
The Coefficient of Determination (R²) tells us how much of the outcome is explained by our
model. It is like a measure of how well our regression equation fits the data.
However, R² alone can be misleading because it automatically increases when we add more
variables even useless ones. To solve this problem, statisticians introduced the Adjusted
Coefficient of Determination (Adjusted R²), which adjusts for the number of predictors and
sample size.
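The contrast can be demonstrated numerically. The following Python sketch (simulated, hypothetical data) fits a model with and without a useless noise predictor: R² can only rise, while Adjusted R², computed with the formula above, can fall.

```python
import numpy as np

def r2_and_adj_r2(y, X):
    """Fit OLS with an intercept; return (R-squared, adjusted R-squared)."""
    Z = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    ss_res = np.sum((y - Z @ b) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    n, k = len(y), Z.shape[1] - 1          # k = number of predictors
    adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adj

rng = np.random.default_rng(1)
hours = rng.uniform(0, 10, 30)
marks = 20 + 5 * hours + rng.normal(0, 5, 30)
shoe_size = rng.normal(size=30)            # deliberately useless predictor

print(r2_and_adj_r2(marks, hours[:, None]))                       # Model 1
print(r2_and_adj_r2(marks, np.column_stack([hours, shoe_size])))  # Model 2
```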
(b) The following pairs of values of X and Y are given and the relationship to be estimated is:
Y = β₀ + β₁X + u
Test the significance of the parameters at the 5 percent level of significance and find R²:
Y:  60   90  110  125  150  170  180  200  220  230
X: 100  150  200  250  300  350  400  450  500  550
Ans: Step 1: Organize the Data
X   | Y
100 | 60
150 | 90
200 | 110
250 | 125
300 | 150
350 | 170
400 | 180
450 | 200
500 | 220
550 | 230
We have n = 10 observations.
Step 2: Compute Means
ΣX = 3250, ΣY = 1535
X̄ = 3250/10 = 325
Ȳ = 1535/10 = 153.5
Step 3: Estimate β₁ (Slope)
Formula:
β̂₁ = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)²
Here Σ(Xᵢ − X̄)(Yᵢ − Ȳ) = 76,875 and Σ(Xᵢ − X̄)² = 206,250, so:
β̂₁ = 76875/206250 ≈ 0.3727
Step 4: Estimate β₀ (Intercept)
β̂₀ = Ȳ − β̂₁X̄ = 153.5 − (0.3727)(325) ≈ 32.36
So, the regression equation is:
Ŷ = 32.36 + 0.3727X
Step 5: Compute R²
Formula:
R² = [Σ(Xᵢ − X̄)(Yᵢ − Ȳ)]² / [Σ(Xᵢ − X̄)² · Σ(Yᵢ − Ȳ)²]
With Σ(Yᵢ − Ȳ)² = 28,902.5:
R² = 76875² / (206250 × 28902.5) ≈ 0.99
This means the model explains about 99% of the variation in Y, which is extremely strong.
Step 6: Test Significance of Parameters
For β₁ (Slope):
Null hypothesis: H₀: β₁ = 0
Alternative: H₁: β₁ ≠ 0
Test statistic:
t = β̂₁ / se(β̂₁), where se(β̂₁) = √(σ̂² / Σ(Xᵢ − X̄)²) and σ̂² = RSS/(n − 2) ≈ 249.1/8 ≈ 31.1
This gives se(β̂₁) ≈ 0.0123 and t ≈ 30.3, far greater than the critical value at 5% significance
(about 2.306 for df = 8).
Thus, β₁ is highly significant.
For β₀ (Intercept):
Similarly, the intercept is also significant (t ≈ 32.36/4.37 ≈ 7.4), though in regression analysis
the slope is usually the main focus.
Interpretation
The regression line is:
Ŷ = 32.36 + 0.3727X
The slope (0.3727) means: for every unit increase in X, Y increases by about 0.37 units.
The intercept (32.36) means: when X = 0, the predicted Y is about 32.4.
The R² of 0.99 shows the model fits the data extremely well.
Both parameters are statistically significant at the 5% level.
Conclusion
This regression analysis shows a very strong linear relationship between X and Y. The slope
is positive and significant, meaning X strongly influences Y. With an R² of about 0.99, the model
explains nearly all the variation in Y, making it a reliable predictor.
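For verification, here is a short Python sketch that reproduces the calculations above directly from the given data (it assumes numpy and scipy are available):

```python
import numpy as np
from scipy import stats

X = np.array([100, 150, 200, 250, 300, 350, 400, 450, 500, 550], dtype=float)
Y = np.array([60, 90, 110, 125, 150, 170, 180, 200, 220, 230], dtype=float)
n = len(X)

Sxy = np.sum((X - X.mean()) * (Y - Y.mean()))   # 76875
Sxx = np.sum((X - X.mean()) ** 2)               # 206250
Syy = np.sum((Y - Y.mean()) ** 2)               # 28902.5

b1 = Sxy / Sxx                       # ~0.3727
b0 = Y.mean() - b1 * X.mean()        # ~32.36
r2 = Sxy ** 2 / (Sxx * Syy)          # ~0.99

resid = Y - (b0 + b1 * X)
sigma2 = np.sum(resid ** 2) / (n - 2)
t_b1 = b1 / np.sqrt(sigma2 / Sxx)       # ~30.3
t_crit = stats.t.ppf(0.975, df=n - 2)   # ~2.306

print(b0, b1, r2, t_b1, t_crit)
```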
SECTION – C
5. Discuss in detail the consequences and remedial measures for heteroscedasticity.
Ans: Understanding Heteroscedasticity in Simple Words
Imagine you are studying how income affects spending. You collect data from many
households.
Poor households: spending varies a little
Middle-income households: spending varies more
Rich households: spending varies a lot
So the spread (variability) of spending increases with income.
This means the error (difference between predicted and actual spending) is not constant.
This situation is called heteroscedasticity.
Definition:
Heteroscedasticity occurs when the variance of the error term in a regression model is not
constant across observations.
In contrast:
Constant variance → Homoscedasticity (ideal case)
Changing variance → Heteroscedasticity (problematic)
Consequences of Heteroscedasticity
Heteroscedasticity does not destroy the regression model, but it creates important
statistical problems. Let’s discuss them one by one in a student-friendly way.
1. Unreliable Standard Errors
In regression, we estimate coefficients (like slope).
But we also calculate standard errors to measure their reliability.
When heteroscedasticity exists:
Standard errors become incorrect.
This leads to:
Wrong confidence intervals
Wrong hypothesis tests
Example:
A variable may appear significant when it is actually not.
2. Invalid t-tests and F-tests
Students often use regression to test hypotheses, such as:
“Does education significantly affect income?”
But if heteroscedasticity is present:
t-statistics become biased
F-statistics become unreliable
So decisions like:
Accepting or rejecting hypotheses
may be wrong.
This is one of the most serious consequences.
3. Inefficient Estimates (OLS not Best)
One key property of OLS (Ordinary Least Squares) is:
OLS is the Best Linear Unbiased Estimator (BLUE)
(when homoscedasticity holds)
But with heteroscedasticity:
OLS estimates remain unbiased
BUT they are no longer efficient
Meaning:
There exist better estimators with smaller variance.
So we lose statistical efficiency.
4. Poor Prediction Accuracy
Because error variance differs across observations:
Predictions become less reliable for some ranges
Model fits some groups better than others
Example:
Income-spending model predicts poor households well
but rich households poorly.
5. Misleading Goodness of Fit
Measures like:
R²
Standard error of regression
can appear acceptable, yet the model is flawed due to heteroscedasticity.
So researchers may falsely believe the model is good.
Remedial Measures for Heteroscedasticity
Now let’s discuss how to fix or reduce heteroscedasticity. These remedies are commonly
taught in econometrics and statistics.
1. Transform the Data
One of the simplest and most effective remedies.
Common transformations:
Log transformation
Square root transformation
Reciprocal transformation
Example:
Instead of using income and spending:
use log(income) and log(spending)
Why it works:
It reduces scale differences and stabilizes variance.
This is the most widely used method.
2. Weighted Least Squares (WLS)
If we know how variance changes, we can give different weights to observations.
Idea:
High-variance observations → small weight
Low-variance observations → large weight
This balances the regression.
Result:
Estimates become efficient again.
So WLS is a direct statistical correction.
3. Use Robust Standard Errors
Modern econometrics often uses:
Heteroscedasticity-robust standard errors
(also called White’s standard errors)
They do not change coefficients.
They only correct standard errors.
Benefit:
Valid t-tests
Valid F-tests
even when heteroscedasticity exists.
This is extremely popular in research today.
4. Improve Model Specification
Sometimes heteroscedasticity occurs because the model is incomplete.
Example:
Income affects spending
but family size also matters
If family size is omitted → error variance varies.
Solution:
Add relevant variables
Use a better functional form
This reduces heteroscedasticity naturally.
5. Divide Data into Groups
If variance differs across categories:
Example:
Urban vs rural households
Solution:
Run separate regressions for each group.
This makes variance more stable within groups.
6. Increase Sample Size
In many practical cases:
heteroscedasticity decreases with larger samples.
Why?
More observations stabilize variance patterns.
Though not a direct cure, it improves estimation reliability.
Summary (Exam-Ready Conclusion)
Heteroscedasticity refers to the situation in which the variance of the error term in a
regression model is not constant across observations. It commonly occurs in cross-sectional
economic data where variability increases with scale, such as income and consumption.
Its main consequences include unreliable standard errors, invalid hypothesis tests, loss of
efficiency of OLS estimators, poor prediction accuracy, and misleading statistical inference.
Although OLS estimates remain unbiased, they are no longer the best linear unbiased
estimators under heteroscedasticity.
Several remedial measures can be applied to correct or reduce heteroscedasticity. These
include transforming variables (especially logarithmic transformation), applying weighted
least squares, using heteroscedasticity-robust standard errors, improving model
specification by including relevant variables, dividing data into homogeneous groups, and
increasing sample size. Among these, log transformation and robust standard errors are the
most commonly used practical solutions.
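To make the remedies concrete, here is a hedged Python sketch using the statsmodels library (assuming it is installed; the income/spending data are simulated so that the error spread grows with income). It contrasts classical OLS standard errors with White's robust (HC1) standard errors, and a WLS fit whose weights assume Var(uᵢ) ∝ incomeᵢ².

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
income = rng.uniform(10, 100, 200)
# Heteroscedastic by construction: error standard deviation grows with income
spending = 5 + 0.6 * income + rng.normal(0, 0.05 * income)

X = sm.add_constant(income)

ols = sm.OLS(spending, X).fit()                   # classical standard errors
robust = sm.OLS(spending, X).fit(cov_type="HC1")  # White's robust standard errors
wls = sm.WLS(spending, X, weights=1 / income**2).fit()  # weights ∝ 1/variance

print("OLS SEs   :", ols.bse)
print("Robust SEs:", robust.bse)
print("WLS coefs :", wls.params)
```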
6. (a) In case of the following model:
Y = β₀ + β₁X₁ + β₂X₂ + u
Suppose X₂ is mistakenly omitted from the above model.
Find the specification bias.
Ans: The Full Model
We start with the correct specification:
Y = β₀ + β₁X₁ + β₂X₂ + u
Here:
Y = dependent variable
X₁, X₂ = independent variables
β₀, β₁, β₂ = parameters
u = error term
The Problem: Omitting X₂
Suppose we mistakenly estimate the model as:
Y = α₀ + α₁X₁ + v
where v is the new error term that now absorbs both the original error u and the effect of
the omitted variable X₂.
So:
v = β₂X₂ + u
This means the new error term is correlated with X₁ if X₁ and X₂ are correlated. That
correlation breaks one of the key assumptions of regression (exogeneity), leading to
specification bias.
Deriving the Specification Bias
The expected value of the estimated slope coefficient in the misspecified model is:
E(α̂₁) = β₁ + β₂ · Cov(X₁, X₂) / Var(X₁)
The extra term β₂ · Cov(X₁, X₂)/Var(X₁) is the bias introduced by omitting X₂.
Key Insight:
If X₁ and X₂ are uncorrelated, then Cov(X₁, X₂) = 0, and there is no bias.
If they are correlated, the bias depends on both the strength of correlation and the
size of β₂.
Intuitive Explanation
Imagine you're trying to measure the effect of hours studied (X₁) on exam scores (Y). But
you forget to include sleep quality (X₂) in your model.
If sleep quality is unrelated to study hours, no problem: your estimate of study
hours' effect is unbiased.
But if students who study more also sleep less (negative correlation), then omitting
sleep quality will distort the estimated effect of study hours. You might wrongly
conclude that study hours have a weaker or stronger effect than they really do.
That distortion is specification bias.
Assumptions Violated
By omitting X₂, the assumption of zero correlation between the regressors and the error term is
violated:
E[X₁v] ≠ 0
This makes OLS estimates biased and inconsistent.
Practical Consequences
1. Biased Estimates: The slope coefficient no longer reflects the true effect of X₁.
2. Misleading Policy Decisions: If regression is used for policy (say, estimating effect of
education on income), omitting relevant variables can lead to wrong conclusions.
3. Inflated R²: Sometimes the model looks “good” statistically, but the estimates are
misleading.
Conclusion
The specification bias in this case is:
Bias = β₂ · Cov(X₁, X₂) / Var(X₁)
It arises because omitting X₂ makes the error term correlated with X₁. The bias depends on
both the true effect of the omitted variable (β₂) and the correlation between X₁ and X₂.
In simple terms: forgetting a relevant variable twists your results. If the omitted variable is
correlated with the included one, your slope estimate is no longer trustworthy.
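The bias formula can be verified by simulation. This illustrative Python sketch (all parameter values are assumptions) generates correlated X₁ and X₂, omits X₂, and shows that the short-regression slope drifts to β₁ + β₂·Cov(X₁, X₂)/Var(X₁):

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta1, beta2 = 100_000, 1.0, 2.0

x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)     # X2 correlated with X1
y = 1.0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)

# Short regression of Y on X1 alone (X2 mistakenly omitted)
b1_short = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)

# Theoretical bias: beta2 * Cov(X1, X2) / Var(X1)  (= 2 * 0.8 = 1.6 here)
bias = beta2 * np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)

print(b1_short)        # ~2.6
print(beta1 + bias)    # ~2.6
```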
(b) Explain Frisch's confluence and Farrar–Glauber tests of multicollinearity in detail.
Ans: 1. Frisch’s Confluence Test of Multicollinearity
Basic Idea
The Frisch test (sometimes called Frisch’s confluence analysis) is based on a very intuitive
principle:
If one independent variable can be well explained by other independent variables, then
multicollinearity exists.
In simple words:
If variable X₁ can be predicted from X₂, X₃, X₄, etc., then these variables overlap in
information. They are not independent, and hence multicollinearity exists.
How the Test Works (Step-by-Step)
Suppose we have a regression model:
Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + u
To check multicollinearity using Frisch's test:
Step 1: Choose one explanatory variable (say X₁).
Step 2: Regress it on the remaining explanatory variables:
X₁ = a₀ + a₂X₂ + a₃X₃ + e
Step 3: Calculate the coefficient of determination (R²) of this regression.
Step 4: Repeat for each explanatory variable.
Interpretation
If R² is high (close to 1) → X₁ is strongly explained by X₂ and X₃ → multicollinearity
exists.
If R² is low → variables are independent → no multicollinearity.
So the logic is simple:
If independent variables explain each other well → they are not truly independent.
Example (Easy to Visualize)
Imagine you are studying consumption (Y) using:
Income (X₁)
Wealth (X₂)
Savings (X₃)
But wealth and savings depend heavily on income.
If we regress Income on Wealth and Savings and get R² = 0.95,
it means Wealth and Savings almost fully explain Income.
So all three variables carry similar information → multicollinearity.
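A quick auxiliary-regression check in this spirit (an illustrative Python sketch; the income, wealth, and savings data are simulated assumptions):

```python
import numpy as np

def auxiliary_r2(target, others):
    """R-squared from regressing one explanatory variable on the remaining ones."""
    Z = np.column_stack([np.ones(len(target)), others])
    b, *_ = np.linalg.lstsq(Z, target, rcond=None)
    ss_res = np.sum((target - Z @ b) ** 2)
    return 1 - ss_res / np.sum((target - target.mean()) ** 2)

rng = np.random.default_rng(4)
income = rng.normal(50, 10, 100)
wealth = 5 * income + rng.normal(0, 5, 100)      # driven largely by income
savings = 0.2 * income + rng.normal(0, 2, 100)

print(auxiliary_r2(income, np.column_stack([wealth, savings])))  # close to 1
```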
Advantages of Frisch Test
Very simple and intuitive
Shows which variable causes multicollinearity
Easy to compute
Limitations
No clear critical value (no formal statistical test)
Subjective judgment (“high R²”)
Cannot measure overall multicollinearity strength
Because of these limitations, econometricians later developed a more systematic method:
the Farrar–Glauber test.
2. Farrar–Glauber Test of Multicollinearity
The Farrar–Glauber test is a more formal and statistical approach to detecting
multicollinearity.
It examines correlation among explanatory variables in three stages.
Think of it as a medical diagnosis:
Stage 1 → Check if disease exists
Stage 2 → Identify affected parts
Stage 3 → Measure severity
Stage 1: Overall Test of Multicollinearity
First, we check whether multicollinearity exists in the whole model.
We compute the correlation matrix of the independent variables and then calculate a Chi-
square statistic:
χ² = −[n − 1 − (2k + 5)/6] · ln|R|
Where:
n = number of observations
k = number of explanatory variables
|R| = determinant of correlation matrix
Interpretation
If calculated χ² > table χ² → multicollinearity exists
If calculated χ² ≤ table χ² → no multicollinearity
So this step tells: Is multicollinearity present?
Stage 2: Individual Variable Test
If Stage 1 confirms multicollinearity, we check which variables are involved.
For each independent variable:
Regress X₁ on other X’s (like Frisch test)
Calculate R²
Compute the F-statistic:
F = [R²ᵢ / (k − 1)] / [(1 − R²ᵢ) / (n − k)]
Interpretation
If F is significant → X₁ is collinear with others
If F not significant → X₁ is independent
So this stage identifies problematic variables.
Stage 3: Pairwise Correlation Test
Finally, we examine the correlation between each pair of independent variables using a t-test:
t = rᵢⱼ √(n − k) / √(1 − rᵢⱼ²)
Interpretation
Significant t → Xi and Xj highly correlated
Not significant → no strong pairwise relation
This step shows which pairs cause multicollinearity.
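For illustration, the Stage 1 statistic can be computed in a few lines of Python (a sketch assuming the χ² form given above, applied to simulated data):

```python
import numpy as np
from scipy import stats

def fg_stage1(X):
    """Farrar-Glauber overall test: chi2 = -[n - 1 - (2k + 5)/6] * ln|R|."""
    n, k = X.shape
    R = np.corrcoef(X, rowvar=False)       # correlation matrix of regressors
    chi2 = -(n - 1 - (2 * k + 5) / 6) * np.log(np.linalg.det(R))
    df = k * (k - 1) / 2
    return chi2, df, stats.chi2.sf(chi2, df)   # sf gives the p-value

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
X = np.column_stack([x1,
                     0.9 * x1 + 0.1 * rng.normal(size=100),  # collinear with x1
                     rng.normal(size=100)])
print(fg_stage1(X))   # large chi2, tiny p-value -> multicollinearity present
```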
Simple Intuitive Summary
Imagine independent variables are students in a group project.
Frisch test:
“Can one student’s work be explained by others?”
If yes → they overlap.
Farrar–Glauber test:
Stage 1: Is group copying happening?
Stage 2: Which students are copying?
Stage 3: Who copied from whom?
3. Comparison: Frisch vs Farrar–Glauber
Feature                       | Frisch Test           | Farrar–Glauber Test
Nature                        | Simple                | Formal statistical
Approach                      | Auxiliary regressions | 3-stage test
Output                        | R² judgment           | χ², F, t statistics
Identifies variables          | Yes                   | Yes
Measures overall collinearity | No                    | Yes
Complexity                    | Low                   | High
4. Importance in Econometrics
Detecting multicollinearity is important because it:
Inflates standard errors
Makes coefficients unstable
Produces wrong signs
Reduces reliability of policy conclusions
The Frisch and Farrar–Glauber tests were early but foundational methods that helped
economists understand regression problems in real-world data.
Final Conclusion
Frisch’s confluence test and the Farrar–Glauber test are classical econometric methods used
to detect multicollinearity among explanatory variables in regression analysis.
Frisch test checks whether one independent variable can be explained by others
using auxiliary regressions and R². A high R² indicates multicollinearity.
Farrar–Glauber test provides a formal statistical procedure in three stages: an overall χ²
test, individual F-tests, and pairwise t-tests, helping detect the existence, sources, and
strength of multicollinearity.
Together, these tests help researchers ensure that independent variables truly provide
unique information, making regression results more reliable and meaningful.
SECTION – D
7. What do you understand by the problem of autocorrelation?
Discuss in detail the Durbin–Watson test and the remedies of autocorrelation.
Ans: What is Autocorrelation?
In regression analysis, one key assumption is that the error terms (uₜ) are independent of
each other. Autocorrelation occurs when this assumption is violated, meaning the error
term in one period is correlated with the error term in another.
Definition: Autocorrelation (or serial correlation) is the correlation of a variable with
its own past values.
Context: It often arises in time-series data, where today’s errors are influenced by
yesterday’s errors.
Example: In economic data, inflation this year may be influenced by inflation last
year, leading to correlated residuals.
Why is Autocorrelation a Problem?
1. Unbiased but Inefficient Estimates: OLS estimates remain unbiased, but they no
longer have minimum variance.
2. Invalid Hypothesis Testing: Standard errors are underestimated, making t-tests and
F-tests unreliable.
3. Misleading Policy Decisions: In applied research, ignoring autocorrelation can lead
to wrong conclusions.
The Durbin–Watson Test
The Durbin–Watson (DW) test is the most widely used method to detect autocorrelation in
regression residuals.
Formula:
d = Σₜ₌₂ⁿ (eₜ − eₜ₋₁)² / Σₜ₌₁ⁿ eₜ²
Where:
eₜ = residual at time t
n = number of observations
Interpretation:
d ≈ 2: No autocorrelation
d close to 0: Positive autocorrelation
d close to 4: Negative autocorrelation
Critical Values:
The DW statistic is compared with tabulated lower (d_L) and upper (d_U) bounds.
If d < d_L: Evidence of positive autocorrelation
If d > d_U: No positive autocorrelation
If d_L ≤ d ≤ d_U: Inconclusive
Example:
Suppose we run a regression and get DW = 1.2. Since this is less than 2, it suggests positive
autocorrelation in the residuals.
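The statistic is easy to compute directly (statsmodels also ships a ready-made durbin_watson function). Here is a small illustrative Python sketch with simulated AR(1) residuals, an assumption chosen just to show the two cases:

```python
import numpy as np

def durbin_watson(e):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2); values near 2 mean no autocorrelation."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(6)
# Positively autocorrelated residuals: e_t = 0.7 * e_{t-1} + noise
e = np.zeros(100)
for t in range(1, 100):
    e[t] = 0.7 * e[t - 1] + rng.normal()

print(durbin_watson(e))                      # well below 2
print(durbin_watson(rng.normal(size=100)))   # close to 2
```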
Remedies for Autocorrelation
When autocorrelation is detected, several remedies can be applied:
1. Model Specification Correction
Sometimes autocorrelation arises because the model is misspecified (e.g., missing
variables, wrong functional form).
Remedy: Add relevant variables or transform the model appropriately.
2. Transformation of Variables
Use first differences: instead of modeling Yₜ, model ΔYₜ = Yₜ − Yₜ₋₁.
This often removes serial correlation in time-series data.
3. Generalized Least Squares (GLS)
GLS adjusts the estimation procedure to account for autocorrelation, producing
efficient estimates.
4. Cochrane–Orcutt Procedure
A specific iterative method to correct for first-order autocorrelation.
It estimates the autocorrelation coefficient (ρ) and transforms the data accordingly.
5. Newey–West Standard Errors
Even if autocorrelation remains, robust standard errors can be used to make
hypothesis testing valid.
Critical Reflection
Autocorrelation is especially common in economic, financial, and environmental
time-series data.
The Durbin–Watson test is simple and widely used, but it mainly detects first-order
autocorrelation.
Remedies like GLS and Cochrane–Orcutt are powerful but require careful application.
Conclusion
The problem of autocorrelation arises when regression errors are correlated across time,
violating a key OLS assumption. The Durbin–Watson test provides a practical way to detect
it, with values close to 2 indicating no autocorrelation. Remedies include correcting model
specification, differencing variables, using GLS, or applying robust standard errors.
8. (a) Discuss the Koyck approach to distributed lag models.
(b) How is the dummy variable model an alternative to the Chow test?
Ans: 8(a) Koyck Approach to Distributed Lag Models
The basic idea: effects don't always happen instantly
In economics and social sciences, many things don’t affect outcomes immediately. Their
impact spreads over time.
For example:
If the government increases advertising expenditure, sales don’t jump instantly.
If interest rates fall, investment increases gradually.
If rainfall improves, agricultural output rises over several seasons.
This is called a distributed lag effect, where one variable influences another over several
time periods.
So instead of saying:
Yₜ = α + βXₜ + uₜ
we say:
Yₜ = α + β₀Xₜ + β₁Xₜ₋₁ + β₂Xₜ₋₂ + ⋯ + uₜ
Meaning:
Today's Y depends on current X, last year's X, the year before that, and so on.
This is called a distributed lag model.
The problem with distributed lags
This model looks nice, but in practice it creates big issues:
Too many lag variables (Xₜ, Xₜ₋₁, Xₜ₋₂, …)
Multicollinearity (lags are highly correlated)
Loss of degrees of freedom
Difficult estimation
So economists wanted a simpler way.
Enter the Koyck Approach
The Koyck method gives a clever shortcut.
It assumes the lag coefficients decline geometrically over time.
Meaning:
βₖ = β₀λᵏ,  k = 0, 1, 2, …
where 0 < λ < 1
Example intuition:
An advertisement has the strongest impact today, less tomorrow, even less later.
How Koyck transforms the model
Start with the infinite distributed lag:
Yₜ = α + β₀Xₜ + β₀λXₜ₋₁ + β₀λ²Xₜ₋₂ + ⋯ + uₜ
Now shift the equation one period back:
Yₜ₋₁ = α + β₀Xₜ₋₁ + β₀λXₜ₋₂ + ⋯ + uₜ₋₁
Multiply by λ:
λYₜ₋₁ = λα + β₀λXₜ₋₁ + β₀λ²Xₜ₋₂ + ⋯ + λuₜ₋₁
Now subtract this from the original equation.
Most lag terms cancel out.
Result:
Yₜ = α(1 − λ) + β₀Xₜ + λYₜ₋₁ + vₜ
where
vₜ = uₜ − λuₜ₋₁
Final Koyck Model
So the infinite lag model becomes:
Yₜ = α(1 − λ) + β₀Xₜ + λYₜ₋₁ + vₜ
This is much easier because:
Only current X is needed
Plus lagged Y
No infinite lags
This is the Koyck transformation.
Interpretation in simple words
The model says:
Today's outcome depends on:
today's input (Xₜ)
yesterday's outcome (Yₜ₋₁)
So the past influence is carried through λYₜ₋₁.
Advantages of Koyck Approach
Converts infinite lags into simple model
Reduces multicollinearity
Saves degrees of freedom
Easy estimation
Limitations
Assumes geometric decay (may not always hold)
Error term becomes autocorrelated
Needs special estimation methods
Real-life intuition
Think of habit formation:
If someone exercised yesterday, they’re more likely to exercise today.
So:
Today’s exercise =
influence of today’s motivation + yesterday’s exercise
That’s exactly the Koyck idea.
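A small simulation sketch (Python, all parameter values hypothetical) shows how the transformed Koyck equation is estimated: generate data from Yₜ = α(1 − λ) + β₀Xₜ + λYₜ₋₁ + noise, then regress Yₜ on Xₜ and Yₜ₋₁. Note that with the true Koyck error vₜ = uₜ − λuₜ₋₁, OLS would need correction; here the error is kept i.i.d. purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
T, alpha, beta0, lam = 300, 2.0, 1.5, 0.6

x = rng.normal(size=T)
y = np.zeros(T)
# Simulate the transformed Koyck equation (with i.i.d. error, for simplicity)
for t in range(1, T):
    y[t] = alpha * (1 - lam) + beta0 * x[t] + lam * y[t - 1] + rng.normal(0, 0.1)

# Estimate by regressing Y_t on X_t and Y_{t-1}
Z = np.column_stack([np.ones(T - 1), x[1:], y[:-1]])
coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
print(coef)   # expect roughly [alpha*(1-lam)=0.8, beta0=1.5, lam=0.6]
```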
8(b) Dummy Variable Model as an Alternative to Chow Test
Now let’s move to the second part.
The core question: Are two groups different?
Economists often want to test whether two time periods or groups have different
relationships.
Example:
Before vs after economic reforms
Rural vs urban markets
Male vs female wages
Two countries
We want to know:
Is the regression equation the same or different?
Traditional method: Chow Test
The Chow test checks whether regression coefficients differ across groups.
Example:
Test if:
β’s before reform = β’s after reform ?
But Chow test has limitations:
Needs separate regressions
Requires equal error variance
Less flexible
So econometricians use a simpler method:
Dummy variable model
What is a dummy variable?
A dummy variable is just a 0-1 indicator.
Example:
D = 1 if after reform
D = 0 if before reform
Dummy Variable Regression Model
We include the dummy variable directly in the regression:
Y = β₀ + β₁X + β₂D + β₃(D·X) + u
This single equation captures differences between groups.
Interpretation of terms
β₀ → intercept before reform
β₁ → slope before reform
β₂ → change in intercept after reform
β₃ → change in slope after reform
So:
Before reform (D = 0):
Y = β₀ + β₁X
After reform (D = 1):
Y = (β₀ + β₂) + (β₁ + β₃)X
Two regressions in one equation!
Why the dummy variable model replaces the Chow test
Instead of running separate regressions and comparing, we estimate:
one combined regression
Then test:
β₂ = 0 → intercept same
β₃ = 0 → slope same
If both zero → no structural change
This is exactly what Chow test checks.
So dummy regression is an alternative.
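A hedged Python sketch of this idea (assuming statsmodels is installed; the data and break point are simulated assumptions): fit the single dummy-interaction regression and jointly test β₂ = β₃ = 0 with an F-test.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 200
x = rng.uniform(0, 10, n)
D = (np.arange(n) >= 100).astype(float)     # 0 before "reform", 1 after
# True structural break: intercept shifts by 2, slope by 0.5 after the reform
y = 3 + 1.0 * x + D * (2 + 0.5 * x) + rng.normal(0, 1, n)

Z = np.column_stack([np.ones(n), x, D, D * x])   # [const, X, D, D*X]
res = sm.OLS(y, Z).fit()

# Joint test H0: beta2 = beta3 = 0 (no structural change), the Chow-type hypothesis
R = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1]])
print(res.f_test(R))   # small p-value -> reject "no structural change"
```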
Advantages over Chow Test
Single regression
Works with unequal sample sizes
Allows many groups
Flexible structural change testing
Easy hypothesis testing
Final Summary
(a) Koyck approach converts an infinite distributed lag model into a simple regression with
current X and lagged Y by assuming geometrically declining lag coefficients. It reduces
multicollinearity and simplifies estimation but assumes geometric decay and introduces
autocorrelation.
(b) Dummy variable model provides an alternative to the Chow test by estimating a single
regression with dummy and interaction terms to test whether intercepts and slopes differ
across groups or time periods. It is more flexible and practical than separate regressions
used in the Chow test.
This paper has been carefully prepared for educational purposes. If you notice any
mistakes or have suggestions, feel free to share your feedback.